Pricing

$1.00 / 1,000 URL reads

URL Metadata & OpenGraph Extractor

Reads a page's own public head tags (OpenGraph, Twitter card, title, description, canonical, favicon, and language) for clean link previews and RAG ingestion. Respects robots.txt by default. Billed only per URL successfully read.

Developer

Pono Data

Input

URLs: one per line.
Respect robots.txt: when on (default), the host's robots.txt is checked and disallowed URLs are skipped.
Max delivered URLs: cap on billed rows (0 = no cap).
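The input handling above can be sketched roughly as follows. This is a minimal illustration, not the actor's code: the function names `parse_url_list` and `partition_by_robots`, the `example-bot` User-Agent, and the simplification that all URLs share one host (and so one robots.txt body) are assumptions. The robots.txt text is passed in as a string so the sketch runs without network access.

```python
from urllib.robotparser import RobotFileParser


def parse_url_list(text: str) -> list[str]:
    """Split a one-URL-per-line input into a list, dropping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def partition_by_robots(urls, robots_txt, user_agent, respect_robots=True):
    """Return (allowed, skipped) for URLs on a single host.

    Assumption: every URL belongs to the host that served robots_txt.
    """
    if not respect_robots:
        # Toggle off: nothing is skipped on robots grounds.
        return list(urls), []
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed, skipped = [], []
    for url in urls:
        if parser.can_fetch(user_agent, url):
            allowed.append(url)
        else:
            skipped.append(url)
    return allowed, skipped


if __name__ == "__main__":
    raw = "https://example.com/\n\nhttps://example.com/private/report\n"
    robots = "User-agent: *\nDisallow: /private/\n"
    ok, skipped = partition_by_robots(parse_url_list(raw), robots, "example-bot")
    print("allowed:", ok)
    print("skipped:", skipped)
```

A real run would need one robots.txt lookup per host; the single-host shortcut here only keeps the example short.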

Output

One row per URL: url, finalUrl, httpStatus, title, description, canonical, the og* fields, the twitter* fields, favicon, lang, plus provenance (sourceUrl, retrievedAt, confidence, dataSource).
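As a rough picture of how such a row might be assembled from extracted tags, here is a small sketch. The exact og* and twitter* field names are not listed above, so the camelCase mapping (`og:image` to `ogImage`) is an assumption, as are the helper names `to_field_name` and `build_row` and the choice to set sourceUrl to the requested URL. The `confidence` and `dataSource` fields are left out because their values are not specified here.

```python
from datetime import datetime, timezone

PLAIN_FIELDS = ("title", "description", "canonical", "favicon", "lang")


def to_field_name(tag_key: str) -> str:
    """Map 'og:site_name' to 'ogSiteName' (assumed naming convention)."""
    parts = [p for p in tag_key.replace("_", ":").split(":") if p]
    if not parts:
        return ""
    head, *rest = parts
    return head.lower() + "".join(p[:1].upper() + p[1:].lower() for p in rest)


def build_row(url: str, final_url: str, http_status: int, tags: dict) -> dict:
    """Assemble one output row; undeclared tags stay None."""
    row = {"url": url, "finalUrl": final_url, "httpStatus": http_status}
    for name in PLAIN_FIELDS:
        row[name] = tags.get(name)
    for key, value in tags.items():
        if key.startswith(("og:", "twitter:")):
            row[to_field_name(key)] = value
    # Provenance. Assumption: sourceUrl is the URL that was requested.
    row["sourceUrl"] = url
    row["retrievedAt"] = datetime.now(timezone.utc).isoformat()
    return row


if __name__ == "__main__":
    # Placeholder values for illustration only, not real output.
    demo = build_row(
        "https://example.com",
        "https://example.com/",
        200,
        {"title": "Example", "og:type": "website", "twitter:card": "summary"},
    )
    print(demo)
```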

How it works

Sites publish these head tags specifically so that other tools can render previews. The actor fetches each page politely with a declared User-Agent, reads only the head, and copies the tags verbatim. Relative og:image, canonical, and favicon URLs are resolved to absolute URLs against the page URL; nothing else is transformed, and a tag the page does not declare is returned as null, never invented. A URL that robots.txt disallows, or that fails to fetch, is written to the free rejected dataset and is not billed. A site owner can ask us to skip their domain at https://ponodata.com/opt-out; opted-out hosts are skipped and never charged.
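The head-only read and the URL resolution described above might look something like the following sketch, which uses only Python's standard library. It is an illustration under assumptions, not the actor's implementation: the class name `HeadTagParser`, the function `extract_head_tags`, and the rule that the first occurrence of a repeated tag wins are all hypothetical choices.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

URL_KEYS = ("og:image", "canonical", "favicon")  # resolved to absolute URLs


class HeadTagParser(HTMLParser):
    """Collects head tags and ignores everything after the head closes."""

    def __init__(self):
        super().__init__()
        self.tags = {"title": None, "description": None,
                     "canonical": None, "favicon": None, "lang": None}
        self._title_parts = []
        self._in_title = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        a = dict(attrs)
        if tag == "html":
            self.tags["lang"] = a.get("lang")
        elif tag == "body":
            self._done = True  # head is over even if </head> was omitted
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = (a.get("property") or a.get("name") or "").lower()
            content = a.get("content")
            if content is None:
                return
            if key == "description" and self.tags["description"] is None:
                self.tags["description"] = content
            elif key.startswith(("og:", "twitter:")):
                self.tags.setdefault(key, content)  # first occurrence wins
        elif tag == "link" and a.get("href"):
            rel = (a.get("rel") or "").lower().split()
            if "canonical" in rel and self.tags["canonical"] is None:
                self.tags["canonical"] = a["href"]
            elif "icon" in rel and self.tags["favicon"] is None:
                self.tags["favicon"] = a["href"]

    def handle_data(self, data):
        if self._in_title and not self._done:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "head":
            self._done = True


def extract_head_tags(html: str, page_url: str) -> dict:
    """Return declared head tags; undeclared ones stay None."""
    parser = HeadTagParser()
    parser.feed(html)
    tags = parser.tags
    if parser._title_parts:
        tags["title"] = "".join(parser._title_parts)
    for key in URL_KEYS:
        if tags.get(key):
            tags[key] = urljoin(page_url, tags[key])
    return tags
```

Calling `extract_head_tags(html, "https://example.com/docs/page")` on a page that declares `og:image` as `/img/card.png` would resolve it against the page URL with `urljoin`, while a page with no description tag would leave `description` as `None`.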

Billing

Pay per URL successfully read. Robots-disallowed and failed URLs cost nothing.
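To make the arithmetic concrete, here is a small sketch of the billing rule at the listed price of $1.00 per 1,000 URL reads. The function name `summarize_billing`, the status labels, and the treatment of successful reads beyond the cap (simply not delivered) are assumptions made for illustration.

```python
from decimal import Decimal

PRICE_PER_1000 = Decimal("1.00")  # listed price: $1.00 / 1,000 URL reads


def summarize_billing(results, max_delivered=0):
    """Split results into delivered and rejected rows and price the run.

    results: iterable of (url, status) pairs, where status is one of
    "ok", "robots_disallowed", or "failed" (hypothetical labels).
    max_delivered: cap on billed rows; 0 means no cap.
    """
    delivered, rejected = [], []
    for url, status in results:
        if status == "ok":
            if max_delivered == 0 or len(delivered) < max_delivered:
                delivered.append(url)
            # Assumption: successful reads past the cap are not delivered.
        else:
            rejected.append(url)  # free: never billed
    cost = PRICE_PER_1000 * len(delivered) / 1000
    return {"delivered": delivered, "rejected": rejected, "cost_usd": cost}


if __name__ == "__main__":
    run = [("https://example.com/a", "ok"),
           ("https://example.com/b", "robots_disallowed"),
           ("https://example.com/c", "failed")]
    print(summarize_billing(run))
```

Under this rule, a run that successfully reads 2,500 URLs would cost $2.50 regardless of how many other URLs were disallowed or failed.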

Sample output

A real run reading each page's own public head tags (one row per URL):

URL	title	description	OG type
https://www.cloudflare.com	Cloudflare: Build for the…	Welcome to Cloudflare - Powering …	website
https://stripe.com	Stripe / Financial Infras…	Stripe is a financial services pl…	website
https://www.python.org	Welcome to Python.org	The official home of the Python P…	website
https://kubernetes.io	Kubernetes	Kubernetes, also known as K8s, is…	website

Every row carries a sourceUrl (the page read), for example https://www.cloudflare.com. Pages that return no metadata are routed to the free rejected dataset.

See also

Domain WHOIS via RDAP: Clean Structured Records

Sitemap Extractor: Every URL, Recursive, Reliable

ZIP / Postal Code to Geo (City, State, Lat/Lon)

URL Metadata Extractor - OG Tags, Twitter Cards, Favicons

URL Metadata Scraper - OG, Twitter, JSON-LD

News & Brand Monitor - Alerts + AI Briefings

Metascraper — Web Metadata Extractor

Maigret Username OSINT Search

Website Metadata Extractor(sitemap, socialLinks, robotsTxt)

Advanced Ebay Item & Store Scraper

Universal Web Scraper - Extract Any URL